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Abstract 

Agent-based modeling and simulation is a useful method to study biological phe- 
nomena in a wide range of fields, from molecular biology to ecology. Since there is 
currently no agreed-upon standard way to specify such models it is not always easy to 
use published models. Also, since model descriptions are not usually given in math- 
ematical terms, it is difficult to bring mathematical analysis tools to bear, so that 
models are typically studied through simulation. In order to address this issue, Grimm 
et al. proposed a protocol for model specification, the so-called ODD protocol, which 
provides a standard way to describe models. This paper proposes an addition to the 
ODD protocol which allows the description of an agent-based model as a dynamical 
system, which provides access to computational and theoretical tools for its analysis. 
The mathematical framework is that of algebraic models, that is, time-discrete dynam- 
ical systems with algebraic structure. It is shown by way of several examples how this 
mathematical specification can help with model analysis. This mathematical frame- 
work can also accommodate other model types such as Boolean networks and the more 
general logical models, as well as Petri nets. 
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1 Introduction 



The arsenal of modeling tools in mathematical biology has grown to include a spectrum of 
methods beyond the traditional and very successful continuous models with the introduc- 
tion of Boolean network models in the 1960s and the more general so-called logical models 
in the 1980s [15j. Since then other methods have been added, in particular Petri nets 
(see, e.g., [8j) as models for metabolic and molecular regulatory networks. More recently, 
agent-based, or individual-based models, long popular in social science, have been used in- 
creasingly in areas ranging from molecular to population biology. Discrete models such as 
these have many useful features. Qualitative models of molecular networks such as logical 
models, do not require kinetic parameters but can still provide information about network 
dynamics and serve as tools for hypothesis generation. Agent-based models can capture the 
fact that in some biological systems, such as a growing tumor, system dynamics emerges 
from interactions at the local level, such as cell-cell interactions in the case of a tumor. 
Discrete models also tend to be more intuitive than models based on differential equations, 
so they have added appeal for researchers without a strong mathematical background. 

The flip side of the coin is the relative lack of mathematical analysis tools for discrete 
models. While methods like bifurcation, sensitivity, and stability analysis are available for 
differential equations models, the principal tool in the discrete case is simulation. While 
this is very effective for small models, it becomes impossible for larger models, since the size 
of the phase space is exponential in the number of variables in the model. Thus, problems 
like the identification of steady states for a Boolean network model becomes problematic 
once the model contains many more than 20 or 30 nodes, unless one makes use of high 
performance computation capabilities. An added complication is the heterogeneity of the 
different discrete model types so that tools developed for one type are unlikely to apply to 
another one. 

One possible approach to this problem is to find a mathematical framework that is 
general enough so that most or all types of discrete models can be formulated within this 
framework and is rich enough to provide practically useful theoretical and computational 
tools for model analysis. This approach was taken in [16], where it was shown that any 
logical model and any /c-bounded Petri net can be translated into a time-discrete dynamical 
system over a finite state space. The transition function can be described in terms of 
polynomial functions. This makes model analysis amenable to the computational tools and 
theoretical results of computer algebra, a theoretically rich area that has taken advantage 
in the last decade of increasingly powerful symbolic computation capabilities. In this 
setting, the computation of steady states of a model, for instance, turns into the problem 
of solving a system of polynomial equations, for which there are several good software 
implementations. In this paper we show that one can make a similar translation for many 
agent-based models, thereby covering a large part of the discrete models available in the 
literature. One added benefit of this common mathematical framework is that one can use 
it to easily compare models of different types. 
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Many complex biological systems can be modeled effectively as multiagent systems in 
which the constituent entities (agents) interact with each other. For instance, processes 
unfolding in a non-homogeneous spatial environment can be modeled in this way, or pro- 
cesses that are inherently discrete, such as individual immune cells interacting with each 
other in a volume of tissues. Examples include [U [21 [HI [T71 [13] . Often, the models include 
large numbers of agents that can be in one of finitely many different states and interact 
with each other and their environment based on a set of deterministic or stochastic rules. 
The global dynamics emerge from the local interactions among the agents. The advantage 
of increased realism of agent-based models (ABMs) is counter-balanced by the relative lack 
of mathematical tools for their development and analysis. 

One key obstacle is the lack of a formal description of ABMs in a way that makes them 
amenable to mathematical tools for analysis and optimal control. An important step in 
this direction is the work of Grimm et al., which provides a protocol for model specification. 
In [6] the authors point out that agent or individual based models are "more difficult to 
analyze, understand, and communicate" than traditional analytical models because they 
are not "formulated in the general language of mathematics." They proceed to develop 
the so-called ODD protocol for the specification of such models. The basic mathematical 
nature of many ABMs is that of a time-discrete dynamical system on a finite state space 
(either deterministic or stochastic). A state of the model can be specified as a vector of 
values, one for each of the model variables. In addition, a function, either deterministic 
or stochastic, is specified that transforms a given model state into another state. Model 
dynamics is generated by iteration of this function. There are other model types, such 
as discrete event simulations, that can also be viewed in this framework. Even hybrid 
models that contain continuously varying quantities, such as temperature, can sometimes 
be treated as discrete models in practice. The ODD protocol provides essentially a standard 
template for specifying the state space of the model and the update function. 

Little is gained in terms of mathematical power by viewing an ABMs as a function from 
the set of states to itself, without any additional mathematical structure, however. And 
it would still be difficult to verify, in many cases, whether the update function has been 
specified completely and unambiguously, an important aim of the ODD protocol. Both 
problems could be remedied by the introduction of additional mathematical structure that 
provides access to mathematical tools and theoretical results, and which at the same time 
respects the fundamental property of ABMs that global dynamics emerges from local in- 
teractions. And the additional mathematical structure should be "benign," in the sense 
that it introduces few or no mathematical artifacts into the model properties, in particular 
its dynamics. Furthermore, it should be computationally tractable, allowing easy model 
comparison, for instance. The addition of such structure to the ODD protocol in a way that 
takes a model specified in the current ODD protocol and translates it automatically into 
a mathematical object would place little burden on the user, while giving access to math- 
ematical tools. In this paper we propose such a mathematical structure and demonstrate 
its use via some examples. 



3 



A natural way to approximate ABMs by state space models that are grounded in a 
richer mathematical theory and satisfies the constraints discussed above is to construct 
an algebraic model specification, that is, a discrete time, discrete state dynamical system 
whose state space represents exactly the dynamic properties of the ABMs. Algebraic 
models can be described by polynomial functions over finite fields, which provides access to 
the rich algorithmic theory of computer algebra and the theoretical foundation of algebraic 
geometry. 

We demonstrate the added value that is gained from such a mathematical description 
through a collection of examples. The first example illustrates the fact that it is easy to 
check by comparing polynomials whether two different implementations of the same model 
are identical, using a published ABM of butterfiy migration |13) . The second example 
consists of an epidemiological model in the form of a cellular automaton. Here one can 
use theoretical mathematical results to determine all periodic points of the model and 
their period without resorting to simulation. Finally, we show how to compute all steady 
states of Conway's Game of Life using computer algebra algorithms for the solution of 
systems of polynomial equations. In a forthcoming paper we will illustrate the use of 
the mathematical framework proposed here for the purpose of designing optimal control 
methods for agent-based models. 

2 Algebraic Models 

The basic idea underlying our approach is very similar to the idea that allowed geome- 
ters to move from synthetic geometry to analytic geometry, namely the introduction of 
a coordinate system. That is, we need to impose an algebraic structure of addition and 
multiplication on the set of possible states of the model variables, so that we obtain a 
field. (This assumption has long been made in the case of Boolean networks, where the 
choice of underlying field is the Galois field F2 = {0, 1}.) This is possible whenever the 
number of states for a given variable is a power of a prime number. In practice, it is easy 
to accomplish that all variables take values in the same finite field F, either by choosing 
an appropriate number of levels for a given variable at the outset or by introducing dupli- 
cates of one or more states. As with coordinate systems on real-valued spaces, there will 
generally be several different choices that all lead to equivalent outcomes. Once we choose 
such an algebraic structure F, then the set function description of an ABM turns into a 
mapping between vector spaces over the finite field F, which can be described in terms 
of polynomial coordinate functions. We briefly describe this class of dynamical systems 
and then provide a detailed description of how to translate an ABM specified in the ODD 
protocol into such a polynomial dynamical system. 

Let xi, . . . , Xn be a collection of variables, which take values in F. The variables repre- 
sent the entities in the system being modeled and the elements of F represent all possible 
variable states. Each variable Xi has associated to it a "local update function" /j : F*^ — > F, 
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where "local" refers to the fact that fi takes inputs from the variables in the "neighborhood" 
of Xi. Here, "neighborhood" refers to an appropriately defined directed graph encoding the 
variable dependencies. These functions assemble to a dynamical system 

/ = (/!,. ..,/„):F"^F-, 

with the dynamics generated by iteration of /. Iteration can be performed by updating 
the variables synchronously or asynchronously. 

The dynamics of / is usually represented as a directed graph on the vertex set F*^, 
called the state space of /. There is a directed edge from v G F" to u E F" if and only if 
/(v) = u. It is easy to show [12j that any local function fi : F" — > F can be expressed as 
a polynomial in the variables xi, . . . This observation has many useful consequences, 
since polynomial functions have been studied extensively and many analytical tools are 
available. 

We discuss a simple example. Let / : F| — y F| be given by /(xi, X2) = (1 — X1X2, 1 + 
2x2). The state space of / has two components, containing two limit cycles: one of length 
two and one of length three. See Figured] (right). The dependency relations among the 
variables are encoded in the dependency graph in Figured] (left). 




Figure 1: The dependency graph (left) and the state space V{f) (right) of the polynomial 
dynamical system in the above example. 

It is discussed in [11} that the framework of algebraic models is particularly suitable 
for the study of agent-based simulations, since many agent-based simulations naturally 
map to this mathematical setting. Furthermore, it grounds the investigation firmly in the 
mathematical fields of dynamical systems and polynomial algebra, both of which provide 
a rich set of tools and concepts. 

3 Polynomial Form of ODD Models 

In p], the authors propose a standard protocol, named ODD after its key components 
Overview, Design concepts, and Details, for describing individual based and agent-based 
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models. The main motivation was to better enable the communication of such models. 
They state: "We conclude that what we badly need is a standard protocol for describing 
IBMs which combines two elements: (1) a general structure for describing IBMs, thereby 
making a model's description independent of its specific structure, purpose and form of 
implementation [...] and (2) the language of mathematics, thereby clearly separating verbal 
considerations from a mathematical description of the equations, rules, and schedules that 
constitute the model." In this section we address the second element and first review 
the key features of the ODD protocol, using the categories from We will make two 
assumptions on the models this section applies to. 

• All state variables in the model are updated in discrete time steps, either explicitly 
in the model or in the way state variable updates are computed. 

• All state variables can take on only finitely many different states. (This includes 
state variables that include probabilities, etc., since in practice these are represented 
by only finitely many choices.) 

While these assumptions exclude some models, they are satisfied for many ABMs found in 
the literature. We next describe the different parts of an ODD model and how they relate 
to algebraic models. 

3.1 Purpose 

This part contains a verbal description of the process the model is intended to capture and 
the questions one hopes to answer using the model. 

3.2 State Variables and Scales 

The term "state variable" refers to the low-level variables that characterize the low-level 
entities of the model, such as individuals or spatial entities. Another class of variables to be 
considered are aggregated variables such as population size or average food density. These 
auxiliary variables contain information that is deduced from low-level state variables. Thus, 
the state variables represent the fundamental components of the system, the parts whose 
interactions create the whole. Aggregate variables contain information about the system 
by aggregating information about the state variables. State variables can be grouped 
according to type, e.g., individuals, spatial state variables, etc.: 

Xi , . . . , Xn , 2/1 , . . . , y-m: ■ ■ ■ , Zi, . . . , Zj.. 

Each state variable x can take on values from a prescribed set X. For instance, an individual 
could be described by the state vector (age, sex, location), so that X consists of a set of 
triples with a mixture of numerical and symbolic entries. A spatial location y could be 
described by the state vector (number of cars occupying the location, traffic flow), so that 
its state set Y contains a set of pairs with numerical entries. An agent is described by a 
collection of state variables. 
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3.3 Process overview and scheduling 

This part contains a verbal description of the specific processes to be captured by the 
model. The scheduling aspect is very important for our purposes. Is time modeled using 
discrete time steps, or continuous time, or both? Arc there different time scales involved, 
e.g., slow and fast, and which variables fall into which category? What is the update order 
for the different state variables, synchronous or asynchronous, with a fixed schedule or in 
random order? That is, the specification needs to give a complete description of the update 
schedule for all state variables. 

3.4 Design Concepts 

For our purposes, the important aspects addressed here are: 

• Adaptation. Do the state variables change the way they interact with other state 
variables, either individuals or spatial state variables, as a result of changes in their 
environment? What are the environmental variables they sense? Do they have a 
fitness objective that drives adaptation? 

• Interaction. What are the dependencies of the state variables and what are the rules 
for their update? 

• 5tochasticity. Do the state variables follow deterministic or stochastic rules to update 
their state? 

3.5 Input 

It is necessary to specify all necessary inputs defining the state of all state variables and 
for computing a state update for each variable. 

3.6 Submodels 

This part contains a detailed description of model equations and rules, as well as all model 
parameters. It should also include a detailed justification for the choices made. 

Mathematical specification. As Grimm et al. point out, the goal has to be to obtain 
a model description that is complete and as mathematical as possible. We now rephrase 
these features in a more mathematical way. The fundamental components of the model 
are as follows. 

• The state variables. We will denote these by xi, . . . , a;„, without taking into account 
the different groupings based on domain-specific notions such as individual or spatial 
entity, etc. 
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• Each state variable Xi has a set of states Xi that it can be in. Thus, a state of the 
model is given by an element of the Cartesian product X = Xi x • • • x X„. Note 
that for the purpose of mathematical specification it is not important that there are 
different types of state variables. This information is implicit in their set of possible 
states. 

• Each state variable Xi is assigned a finite collection of rules to update its state. At 
each step, each state variable chooses a rule, either deterministically or stochastically, 
which takes as input the states of all or some other state and environmental variables, 
and assigns a new state to Xj. The choice of rule might involve aggregated variables 
and/or random choices. Note that this rule needs to provide complete information 
about how to determine the new state, given any admissible input state of the state 
variable. For instance, a bacterium might have several metabolic modes depending 
on the environment it finds itself in and the density of other bacteria present. The 
update rule chosen will then depend on the relevant environmental variables, and 
possibly others. 

• We are given a complete specification of the order in which state variables are to be 
updated. That is, to compute a new state of the model, we update some variables 
before others, some variables simultaneously, and for some variables we choose a 
random update order. Different time scales can be implemented by updating faster 
variables several times before updating slower ones. 

Observe that each rule for the update of a state variable can be expressed as a function 
fi : X — > Xi. We can now assemble these components to represent the model as a time 
discrete dynamical system 

f = {fi,...Jn):X^X, 

with dynamics generated by iteration. We describe the most general case of models that 
allow state variables to evolve and choose different update rules depending on environmen- 
tal conditions. In this case, each state variable Xi has associated to it a probability space 
Pi of rules/update functions fi : X — > Xi, which represent its different "modes of action," 
depending on the environment. The probability distribution on Pj can be computed with 
information provided as part of the ODD. For instance, a state variable may choose an 
update function based on the state of one or more aggregated variables that describe its 
environment, such as food density, or the states of other state variables in its environment. 
For a given model update, each state variable chooses one update function from this prob- 
ability space. The details of the construction can be found in Appendix 1. The key step 
in the construction is the replacement of each Xi with a finite field F, so that X = 
The end result is that we can now describe the ABM as a dynamical system 

/ = (/i,...,/„):F"^F", 
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with all fi e ¥[ xi,...,Xn] polynomials. So what have we accomplished? Translating 
a model specified in the ODD protocol into a polynomial dynamical system has several 
advantages: 

• Models are stored in a unified mathematical way. 

• Ambiguities in the verbal description and incomplete information can be detected in 
the translation to an equation-based model. 

• It is easy to implement an existing model and modify it. 

• There exists a body of mathematical tools to analyze models, such as computing all 
steady states, which amounts to solving a certain system of polynomial equations. 
Also, there is a framework for optimal control in this context, which we describe in 
a future paper. 

• It is easy to compare models. 

• It is easy to incorporate variables describing the global environment, such as temper- 
ature, market price, pH value, as external parameters into the polynomial functions. 

It is also worth mentioning that the mathematical framework we have created for ABMs 
in this way is "minimal," in the sense that all we have done is to impose a mathematical 
structure on the state space of the model. We have not changed or approximated the rules 
used to update state variables, and we have not changed the way in which the states of 
the model are updated. That is, we still have an exact representation of the ABM, but in 
precise mathematical terms. 

Polynomials are neither intuitive nor are they simple functions. But they provide an 
exact representation of the dynamics of the model that is more compact than the state 
space, which is not feasible to describe for most realistic models. Any computer algebra 
system can be used to analyze a polynomial system, independent of a particular software 
or implementation. The polynomials can be generated in an almost automatic way: we 
provide a simple script to generate the polynomials that interpolates a given truth table 
[9], and tables are easily generated from the description of the model. We illustrate this 
process with an example of a model specified in the ODD protocol, taken from the text 
book [7]. 

4 Examples 

We now show three examples, one of which demonstrates how to translate a model specified 
in the ODD protocol to its algebraic representation, and the other two show how the 
algebraic representation can be used to analyze the global dynamics of the system without 
simulating it. We want to point out that these examples are meant as "proof of concept" 
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illustrative demonstrations. A deeper analysis of each of them to show what the algebraic 
language can and cannot do is beyond the scope of this paper, and for this purpose these 
examples might not be the best choices. The first model was chosen because it is a key 
example in the expository book [7J. The second example affords an easy way to demonstrate 
how one might use theoretical results about algebraic models for the purpose of analyzing 
dynamics of models that are much too large to study thoroughly through simulation. And 
the third example is a widely known cellular automaton that has rarely been studied from 
the point of view of a dynamical system. 

4.1 Prom ODD to algebraic model: Virtual Corridors of Butterflies 

We demonstrate how to formulate the algebraic description for a model given in the ODD 
protocol and show how this process provides guidelines to the modeler for including all 
relevant details and formulating the model such that it is suitable for the purpose it was 
built for. The example we use is a model analyzing the emergence of virtual corridors in 
the movements of butterflies navigating a landscape based on an elevation gradient [13J. 
This model was used in [7] as an example of how to specify an ABM in the ODD protocol. 

We model the "hilltopping" behavior of butterflies, as they try to reach the highest 
point in their spatial environment for mating. The model is initialized with 500 butterflies 
on a landscape discretized into 150 x 150 square patches. A butterfly moves with probability 
q to the highest patch of its 8 surrounding patches, and it moves randomly with probability 
1 — q. Initially, all butterflies start out on the same patch, and simulations are run for 1000 
iterations. 

The purpose, see Section 13.11 of the model is to understand what conditions lead to 
virtual corridors and how the uncertainty of butterflies to sense the elevation gradient 
correctly affects these virtual corridors. This clearly stated purpose forces us to use patches 
as agents, i.e., use a variable Xi for each patch. Butterflies are the "acting agents," so 
they have to be represented as variables, too. Since the butterflies are assumed to be 
homogeneous the state of a butterfly only consists of its position. We enumerate the 
patches so every state corresponds to a different position. Patches differ by their elevation 
(which is fixed during a simulation) and the number of butterflies on them. At the end of 
a simulation, one can compare the number of butterflies on the patches to detect virtual 
corridors. 

4.1.1 Algebraic Model 

Each butterfly is a state variable, xi, . . . ,X5oo! and so is each patch, 2:501, . . . ,X23000) we 
let X denote a state of the system, x = {xi, . . . , 2:23000 )• The state of a butterfly is its 
position (1, . . . 22500), the state of a patch is the number of butterflies on it (0, . . . , 500). 
We enumerate the patches and assume that an elevation map of the modeled area is given. 
Updates are synchronous probabilistic updates with fixed probabilities. From the elevation 
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map we can create 9 different tables: the first table assigns to every patch its most elevated 
neighbor, the remaining 8 tables assign to every patch its north (north-east, east, south- 
east, .... north-west) neighbor. 

Since there are 22500 different states for a butterfly and we want to work over an 
algebraic field, we choose p = 22501, the next highest prime number, as described in 
Appendix 1. The algebraic model is then a system of equations over F2250i- 

The first table has two rows. The two entries in a column, labeled in the first row 
and bi in the second row, for i £ {1, • • • ,22500} indicates that the patch adjacent to 
with the highest elevation is bi. We generate the polynomial gj^i that updates a butterfly 



for J G {1, . . . , 500} in the following way. 

22500 



Note that for all butterflies the polynomials gj^i are the same because all butterflies have 
exactly the same hilltopping strategy and ability. This function is generated as follows. 
The state of a butterfly is equal to the number of the patch the butterfly is on. Therefore 



X, 



is equal to 1, except when Xj 



0. So (1 - {ai 



is equal to 0, unless Xj 



ai, i.e., the butterfly is on patch Oj, then it is 
Oj, in which case it is 1, and we multiply 



by bi- So gj^x) interpolates the first table, it only depends on the state of butterfly xj, 
not on the other butterflies or their distribution over the patches. It is straightforward to 
generate gj^x), . . . ,gj^Q{x) for the remaining 8 tables. If a butterfly detects the correct 
elevation with probability q, then the probabilistic update function for butterfly xj is 



9j,i{x 

9jA{x 
9j,b{x 
9jfi{x 
9j,7{x 

93,^{X 



with probability 
with probability 
with probability 
with probability 
with probability 
with probability 
with probability 
with probability 
with probability 



1^. 



where gj^i interpolates table i. 

The functions for the state of the patches, /j, j G {501, 23000}, are built the 
following way: 



500 



i=l 



As before, each summand is 1 or 0, depending on whether Xi is equal to j or not, respec- 
tively. An increment of 1 is added to the state of patch j whenever a butterfly Xj is in state 
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j, i.e., on patch j. The algebraic system that describes the complete butterfly model is 

f . ip500+22500 , 17^500+22500 
/ • -"^22501 ~^ ^22501 ' 

(Xi, . . . ,X23000) ^ {fl{x), . . . ,/23000(a;)). 

This example demonstrates how to formalize a model given in the ODD protocol as an 
algebraic model, which fully captures all details of the butterfly hilltopping behavior. The 
algebraic framework also provides further advantages to the modeler: algebraic equations 
eliminate any ambiguity inherent in verbal descriptions. The functions are specified for all 
possible cases, some of which are often left out when specifying a function verbally. 

One benefit of this description is the ease of comparison of different model implemen- 
tations. We illustrate this with an example. In the model description in [7], a butterfly 
"moves uphill", i.e., "to the neighbor patch that has the highest elevation," with probabil- 
ity q. But what if a butterfly is already on the highest patch? All neighboring patches have 
a lower elevation, so any movement is a downhill movement. Two different interpretations 
are possible: 

1. If a butterfly is on the highest patch, it stays on the same patch with probability q, 
to not contradict the "move uphill" instruction; 

2. butterflies can always move, even if this means that "move uphill" is actually a 
downhill movement. 

Clearly, these two interpretations lead to different update functions (which may or may 
not affect the qualitative dynamics of the model). The difference in a verbal description or 
implementation might be hard to notice, but it is easy to see the difference by comparing 
the resulting polynomials. Using standard computer algebra tools, the terms of each poly- 
nomial are sorted in a unique way, and the complexity of the comparison of the individual 
terms is linear in the number of terms. Thus, formulated as algebraic models, their differ- 
ence becomes clear, a fact that otherwise might go unnoticed and lead to discrepancies in 
model dynamics. 

Questions about global dynamical behavior can now be formulated in terms of solving 
systems of polynomial equations, which come naturally from the algebraic model. Due 
to the large number of variables involved in the above example, the resulting polynomial 
systems can not be solved easily by direct computation on a standard desktop computer 
with today's methods and software. However, we believe that with the advancement of 
algorithms and computational techniques, it will very soon be possible to analyze such 
models with symbolic computation techniques. 

Currently, there exist no feasible mathematical methods to analyze the dynamics of 
general large stochastic models without iterating over the entire state space, i.e., simulat- 
ing the system, in whatever format. However, it is possible in many cases to approximate 
a stochastic model with a deterministic system that captures the main dynamical fea- 
tures. Next we show two examples where mathematical theory can be used to analyze the 
dynamics of agent based systems. 
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4.2 Simple Infection Model 



Consider the following model for the spread of an infection: agents are cells on a square 
grid, every cell can be healthy or infected. Healthy cells are depicted in white, infected 
cells in black. Figure 2(a) shows the layout of the grid with a cell and its four neighbors. 
Here, we are considering the so-called von Neumann neighborhood of a cell. 




(a) Grid with a cell (black) 
and its 4 neighbors 



(b) Random infection of cells and the grid after one 
iteration 



Figure 2: Infection model 



The systems evolves according to the following rules: A cell acquires a healthy state, if 
its four neighbors are healthy, otherwise infected. Figure [2(b)| shows a randomly infected 
grid and its state after one iteration. 



Figure 3: Dependency graph for infection model 
This agent based model can easily be translated into an algebraic model. Variables 
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represent the cells, healthy cells have state ON (1), infected cells are OFF (0). Since there 
are two states for each variable, we choose F2 as base field. The update rule is homogeneous 
for all cells, and for a cell x with neighbors yi,y2,y3, and it is given by 

/x(yi,y2,y3,y4) = ^12/2^32/4- 

One easily sees that fx = 0, i.e., infected, unless all four neighbors are healthy (1). The 
resulting polynomial dynamical system is a so-called conjunctive Boolean network, as de- 
scribed in |10] . The dependency graph of this network is shown in Figure [3] and clearly 
consists of one strongly connected component, that is, any node can be reached from any 
other node by a directed path. We briefly describe one of the results in [10] that applies 
here. 

The loop number of a directed, strongly connected graph is defined as follows: choose 
a vertex (the loop number is independent of the vertex chosen) and consider all directed 
loops at this vertex. The loop number is the greatest common divisor of the lengths 
(number of edges) of all such loops [10]. It is easily seen that the infection model has loop 
number 2. Theorem 3.8 in [TO] states that the length of any limit cycle of the system has 
to divide the loop number, so that there are only steady states and limit cycles of period 
2. Furthermore, the theorem gives an example formula for the number of limit cycles of 
a given length according to which the system has exactly two steady states and one limit 
cycle of length 2, and no other limit cycles. The two states of periodicity 2 are shown in 
figure m By the nature of the update rule the disease is fast spreading, a single infected 
neighbor is sufficient for a cell to be infected. It is interesting and counterintuitive that 

there is a limit cycle of length 2 with only half the cells infected. The basin of attraction 

^2 

of this cycle for an n x n grid is of size 2~~^^ — 2, any combination of at least half the cells 
being healthy and arranged as in Figure [H There are two obvious steady states, given by 
all cells healthy or all cells infected. Although it might be easy to find the two steady states 
and the limit cycle by studying the agent based model and not its algebraic representation, 
one would still need to prove that these are the only fixed points and that there is exactly 
one limit cycle of length 2, and no limit cycles of greater length. Finding the fixed points 
for such a system by simulation is computationally not feasible: on a 500 by 500 grid, there 
are 2"^^^^^^ different states in the state space that one would have to test. 

The beauty of this algebraic model is, that the theorem in [10] applies to an arbitrarily 
large finite grid. An elegant way of dealing with boundary cells is to project the grid onto 
a torus by connecting the cells on the right edge to their counterparts on the left edge, and 
similarly for the top and bottom cells. 

4.3 Conway's Game of Life 

Our third example is Conway's Game of Life [3], a 2-dimensional cellular automaton (CA), 
using the 8 neighbors of a cell, that is, the Moore neighborhood of a cell, to compute the 
next state. While most ABMs are not in the form of a cellular automaton, Conway's Game 
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Figure 4: Two states of periodicity 2 



of Life has some of the same characteristics as many ABMs. It is also interesting for our 
purpose in that it poses an interesting computational challenge for our framework. The 
rules of this CA have to cover many cases so that the polynomials expressing the rules are 
very dense, containing almost all possible terms. This affects the computational complexity 
of the algorithms significantly. 

Cells are in one of two states, either LIVE (1) or DEAD (0), with the following behav- 
ioral rules: 

1. Any live cell with fewer than two live neighbors dies, as if caused by underpopulation. 

2. Any live cell with more than three live neighbors dies, as if by overcrowding. 

3. Any live cell with two or three live neighbors lives on to the next generation. 

4. Any dead cell with exactly three live neighbors becomes a live cell. 

Although the dynamics of the system are determined by very simple rules, the emerging 
patterns are fascinating and have been studied extensively. Questions that are natural to 
ask are what steady states or oscillators can occur. We will show how to answer these 
questions by using an algebraic model of the Game of Life. 

Naturally, the variables or agents in this system are the cells. There are only 2 possible 
states for an agent, DEAD or ALIVE. Therefore, we can describe its behavior with poly- 
nomials over F2. Every agent x has 8 neighbors, xi, . . . , xg- The function fx that describes 
the transition of agent x, is 



fx{x,xi,. . . ,xs) 



< 2 

^ Xj = 2 and x 
^ Xj = 2 and x 
Y^Xi = 3 
Y^Xi>3 




1 



For function fx in polynomial form, see Appendix [Bl The algebraic model for the Game 
of Life is a system 

/ ~ (/l )•••;/« 



: F^^" 
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' ^ /i (^1 ; • • • ) 2^nxn) ) 

where n is the dimension of the square grid. To calculate all the fixed points, i.e., still lives 
of this system, we solve the system of polynomial equations: 

fi{x) — Xi = 0, i = 1 . . . n X n. 

One can use a computer algebra system like Macaulay2 0] to compute a Grobner basis of 
the ideal that is generated by the equations for the fixed points of the system, from which 
one obtains the fixed points. 

For example, on a 4 x 4 grid with periodic boundary conditions, the fixed points of the 
model are the solutions to the system 

fi{x) =Xl 
fwix) = Xi6. 

First we compute a Groebner basis in lexicographic order for / = — xi, . . . , fie{x) — 

xiq) over the quotient ring F2[xi, . . . ,xiq]/J, where J = {xf — xi, . . . ,xfg — xiq). From a 
factorization of the Grobner basis, one can easily read off the solutions. There are 53 fixed 
points. 

One should remark, that all fixed points can easily be found by updating every possible 
initialization and checking whether it is a fixed point. For a 4 x 4 grid, there are only 
2^^ = 65536 states, so finding all fixed points is no computational challenge that would 
require computer algebra. As the grid increases, the complexity of this brute force approach 
increases exponentially, whereas the number of variables and equations increases linearly 
in the algebraic model. For example, for a 10 x 10 grid one has to check 2^'''^ states. 
With the algebraic model though, one has to compute a Grobner basis in a ring with 100 
indeterminates for an ideal with 100 generators. Depending on the polynomials describing 
the update rules, this is a fast computation. Admittedly, for the functions that describe 
the Game of Life, we were not able to compute the solutions for a system with more 
than 16 variables, neither with Macaulay2 [4] nor Singular [5]. However, with the rapid 
improvement of hardware and computer algebra software, solving the polynomial system 
to analyze the dynamics will soon become feasible for larger grid sizes and also faster than 
simulating every state which requires exhaustive enumeration of the state space. 

5 Discussion 

At this time there is no broadly agreed-upon mathematical framework which can serve as 
a standard for the specification and analysis of agent-based models and which preserves 
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the key feature of this model type that global dynamics emerges from local interactions. 
In this paper we propose such a framework, which preserves all features of agent-based 
models and provides access to mathematical analysis tools. It is conceived as an extra step 
in the framework of the ODD protocol, which represents a first step toward a standardized 
protocol for the specification of agent-based models. The extra step can be automated, 
so that users do not need familiarity with the underlying mathematical concepts. The 
mathematical framework is that of polynomial dynamical systems over a finite field, which 
provides access to theoretical and computational tools from computer algebra and discrete 
mathematics. 

We have presented examples of how this extra step of model specification works in 
practice, and we have presented examples of how the mathematical specification provides 
added value by allowing access to theoretical and computational tools for model analysis. 
We emphasize that algebraic model specification is an addition to the ODD protocol, not 
a replacement. The model must be explained in ODD to be understood by others. An 
algebraic system is an additional resource that can be used to distribute and re-use the 
model. It eliminates any ambiguity created by a verbal description, and it is a compact 
format that can run on any system independent of software implementation, so parameters 
and rules are easily modified for further simulations. The algebraic representation allows 
easy comparison between two models. The rigorous mathematical language is another 
advantage of the framework, rich algorithmic theory from computer algebra and the theo- 
retical foundation of algebraic geometry are available to analyze algebraic models. We are 
making available a basic tool to automatically generate a polynomial from data in ODD 
[9] to ease the process of creating an algebraic model. Because of its many advantages, we 
hope that modelers will extend their model description to the algebraic description and 
store their models in a central location so that models can easily be found and re-used. 
The polynomial systems framework unifies the representation of three important discrete 
model types: agent-based models, logical models (including Boolean network models), and 
Petri net models, allowing direct comparison of different models. 

The translation algorithm presented in this paper only applies to deterministic agent- 
based models. And the algorithms in |16j also deal only with deterministic models. How- 
ever, the majority of published ABMs in biology are stochastic. Fortunately, there are 
stochastic versions of several of these model types available on the mathematical side, that 
can be used for the purpose of modeling stochastic ABMs. The most suitable model type is 
that of probabilistic Boolean networks [?], a multi-state generalization of Boolean networks 
that is stochastic in two ways: each node of the network has attached a probability space 
of update functions rather than a single function and, secondly, the update order can be 
stochastic. It is easy to represent these models in the polynomial dynamical system frame- 
work, which we have done as part of our software package ADAM (Analysis of Dynamic 
Algebraic Models), available at http://adam.vbi.vt. edu| as a web service. 

Finally, we comment on the computational complexity of the analysis of agent-based 
models by the methods proposed in this paper. While a significant number of published 
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ABMs are well within the computational reach of our methods, there are many ABMs 
that completely overwhelm them. This is of course similar to the situation for continuous 
models and parameter estimation methods, bifurcation analysis, etc. There too, models are 
becoming too large to be analyzed by anything other than more or less through simulation. 
For some ABMs even simulation represents a challenge because of their size. In these cases 
new methods for model reduction are the only viable approach to a mathematical analysis, 
no matter what methods are available. For instance, in the case of a multi-scale ABM one 
possible approach might be to construct phenomenological models for the higher scales that 
accurately model the aggregate dynamics of the lower scales without explicitly representing 
these. This is the subject of ongoing work by the authors. 
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A Translation from ODD to polynomial model 



Here we describe the details of constructing a polynomial model from ABM information 
stored in the ODD protocol format. 

A.l Update schedule 

The scheduling information provided allows the assembly of a complete update order for 
all the state variables that need to be updated, possibly involving a mixture of sequen- 
tial, parallel, deterministic and random updates of subgroups of the state variables. The 
scheduling information can be assembled to a probability space P, which has as elements 
the set of words in the letters ui, . . . , Un] vi, . . . ,Vn- Each word UiUjUk ■ ■ ■ VaVb ■ ■ - Uc - ■ ■ 
translates into an update order XiXjXk • • • {xaXh ■ ■ ■ )xc • • • , which is to be interpreted as 
updating first Xi,Xj,Xk, ■ ■ ■ sequentially in this order, then updating Xa, Xb, Xc, ■ ■ ■ in paral- 
lel, then update Xc, ■ ■ ■ sequentially in this order, etc. The probability distribution on the 
space P can be computed from the scheduling information which indicates the variables 
to be updated randomly and with which probability distribution. The resulting dynamical 
system will be denoted as 

f={Pi,...,Pn;P):X^X. 

At this point we have described the dynamical system as a set function, which is rather 
limiting. For instance, if we want to compare whether two such models are identical, 
we need to evaluate both at all possible inputs and compare the outputs. It would be 
useful if we could describe the function / via equations. Now, sometimes, the model 
specification will already be given to us in the form of equations or mathematical functions. 
We now describe a procedure that allows us to represent / by mathematical functions of 
a unified type, in all circumstances. This procedure is analogous to the introduction of a 
Cartesian coordinate system to transition from synthetic geometry to analytic geometry. 
The fundamental observation is the following. 

A. 1.1 Observation 

Let F be a finite field, such as Z/p and let / : F" — F be any function. Then there exists 
a unique polynomial 5 G F[xi, . . . , x„], such that each variable in g appears to a power less 
than |F|, and /(ai, . . . , an) = g{ai, . . . , a„) for all (ai, . . . , a„) € F". 

Thus, in the case that all state variables take on states in the same state set, which 
furthermore can be given the structure of a finite field, then the dynamical system / : 
X — X above can be described via polynomials. This is in fact always possible, and 
we briefiy outline the process. It is similar to a construction described in detail in [16j. 
There, we show how to translate a so-called logical model into a polynomial dynamical 
system. Logical models are specified in a way that is very similar to the rules for state 
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variables in an ABM. First consider one state variable and suppose that one of its states 
is described by an r-tuple from a set Xi x • • • x X^. In each component Xj we choose an 
element and duplicate it enough times so that the number of elements in Xi becomes equal 
to the smallest prime number that is greater than or equal to the orders of all the Xj. If 
Xi contains ordered numerical values, for instance, say Xi = {1,2,3,4}, then we can add 
a state 4' to obtain a set with 5 elements. After carrying this construction out for all Xi 
we have obtained a set Xi x • • • x Xr in which each component has the same number of 
elements, equal to the power of a prime number p. It is now straightforward to see that 
one can endow this set with a field structure so that it becomes isomorphic to the Galois 
field ¥pr. A similar construction can now be carried out to assure that the order is the 
same for the Galois fields for all state variables. The end result is that all state variables 
take values in the same finite field F. The last step is to extend the update functions fi 
for each state variable to the larger state set. This is done by assigning the same value to 
a new state as the state that was duplicated to obtain it. 

B Behavioral Rule in Polynomial Form 

For a 4 by 4 grid, we obtain for agent xi with neighbors X2 ■ ■ ■ ,xg the following polynomial: 

/ = Xi*X2*Xs*X4*X5*XQ*XY*Xs+Xi*X2*X3*X4*X5*XG*X'r*Xg+Xi*X2*Xs*X4*X5*XQ*X8* 

Xg + Xi*X2*X3*X4^*X5*Xl*X8*Xg+Xi*X2*X^*X4^*XQ*Xi*Xg,*XQ+Xi*X2*Xs*X^*XQ*Xj*Xs* 
XQ + Xi*X2*X4*X5*XQ*X-r*Xs*Xg + Xi*Xs*X4*Xrj*XG*X7*X8*Xg+Xi*X2*X3*X4*X5*XQ*Xr + 
Xi*X2*X3*X4,*X5*Xq*X8+Xi*X2*X3*X4,*X5*X7*Xs+Xi*X2*X3*X4,*Xq*X7*X8+Xi*X2*X3*X5* 
Xg*Xt*Xs+Xi*X2*X4*X5*Xq*Xy*Xs+Xi*Xs*X4^*X5*Xg*Xy*Xs+X2*Xs*X4*X5*Xq*Xy*Xs + 
Xi*X2*X3*X4*X5*XQ*Xg+Xi*X2*X3*X4*X^*XY*Xg+Xi*X2*X3*X4*XQ*Xj*Xg+Xi*X2*X3* 
X5*XQ*X-r*Xg + Xi*X2*X4*Xrj*XQ*XT*XQ+Xi*X3*X4*X5*XQ*X7*Xg+X2*X3*X4*X5*XQ*Xj* 
Xg + Xi*X2*X3*X4*X5*X8*Xg + Xi*X2*X3*X4*XQ*X8*Xg + Xi*X2*X3*X5*XQ*X8*Xg + Xi*X2* 
X4*X5*XQ*Xs*Xg+Xi*X3*X4*X5*XG*Xs*Xg+X2*X3*X4*X5*XQ*Xs*Xg+Xi*X2*X3*X4*X'7* 
Xs*Xg + Xi*X2*X3*X5*X7*X8*Xg + Xi*X2*X4*X5*X-r*Xs*Xg + Xi*X3*X4*X5*Xl*Xs*Xg+X2* 
X3*X4*X5*Xj*Xs*Xg+Xi*X2*X3*XQ*Xj*Xs*Xg + Xi*X2*X4*XQ*Xj*Xs*Xg+Xi*X3*X4*XQ* 
Xr*Xs*Xg + X2*X3*X4*XQ*X7*X8*Xg + Xi*X2*X5*XQ*Xj*X8*Xg + Xi*X3*X5*XQ*Xj*X8*Xg + 
X2*X3*X5*XG*X'7*Xs*Xg+Xi*X4*X5*XQ*XT*Xs*Xg+X2*X4*X5*XG*X'7*Xs*Xg+X3*X4*X5* 
XQ*Xi*Xs*Xg + Xi*X2*X3*X4+Xi*X2*X3*X5+Xi*X2*X4*X5 + Xi*X3*X4*X^ + Xi*X2*X3*XQ + 
Xi*X2*X4*XQ + Xi*X3*X4*X(] + Xi*X2*X5*XQ+Xi*X3*Xrj*XQ + Xi*X4*X5*XQ + Xi*X2*X3*Xj + 
Xi*X2*X4*X-j + Xi*X3*X4*Xj+Xi*X2*X5*X7+Xi*X3*X5*Xj + Xi*X4*X5*X-j + Xi*X2*XQ*Xl + 
Xi*X3*XQ*XY+Xi*X4*XG*XY+Xi*X5*XG*X'r+Xi*X2*X3*Xs+Xi*X2*X4*Xs+Xi*X3*X4*Xs + 
Xi*X2*Xrj*Xs + Xi*X3*X5*X8+Xi*X4*X5*Xs+Xi*X2*XQ*Xs + Xi*X3*XQ*X8 + Xi*X4*XQ*Xs + 
Xi*X5*Xq*X8 + Xi*X2*Xj*X8+Xi*X3*Xj*X8+Xi*X4*X7*X8 + Xi*X5*Xt*X8 + Xi*Xq*X7*X8 + 

xi*a;2*X3*a;9+xi*X2*a;4*X9+xi*X3*X4*X9+xi*X2*a;5*X9+xi*a;3*X5*X9+xi*X4*X5*X9+ 

Xi*X2*XQ*Xg+Xi*X3*XG*Xg+Xi*X4*Xe*Xg+Xi*X5*XQ*Xg+Xi*X2*X7*Xg+Xi*X3*XT*Xg + 
Xi*X4*X7*Xg+Xi*X5*XT*Xg+Xi*XQ*XY*Xg+Xi*X2*X8*Xg+Xi*X3*X8*Xg+Xi*X4*X8*Xg + 
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Xi*X5*Xs*X9+Xi*XQ*Xs*Xg+Xi*XT*Xs*Xg+Xi*X2*X3+Xi*X2*X4,+Xi*X3*X4+X2*X3*X4 + 
Xi*X2*X5+Xi*X3*X5+X2*X3*X5+Xi*X4*X5+X2*X4*X5+X3*X4*X5+Xi*X2*Xq+Xi*X3*Xq + 

X2*X3*XQ+Xi*X4*XQ+X2*X4*XQ+X3*X4*XQ+Xi*X5*XQ+X2*Xr^*Xc,+Xs*Xr^*XQ + X4*X^*XQ + 
Xi*X2*X7+Xi*X3*X7+X2*X3*X7+Xi*X4*XT+X2*X4*XY+X3*X4*X7+Xi*X5*Xi+X2*X5*Xy + 
Xs*X5*X7+X4*X5*X7+Xi*XQ*XT+X2*XQ*Xy+Xs*XQ*Xy+X4,*XQ*X7+X5*XQ*X7+Xi*X2*Xs + 
Xl*X3*Xs+X2*X3*X8+Xl*X4,*X8+X2*X4*X8+X3*X4ifX8+Xl*X5*Xs+X2*X5*X8+X3*X5*X8 + 
X4*Xr,*X8+Xi*XQ*X8+X2*XQ*X8 + X3*XQ*X8 + X4*XQ*X8+X^*XQ*X8+Xi*X7*X8 + X2*X7*X8 + 
X3*X7*Xg+X/i*X7*Xg+Xrj*X7*Xs + XQ*X7*Xs + Xi*X2*X()+Xi*X-i*XQ+X2*X-i*XQ + Xi*X/i*XQ + 
X2*X4*XQ'j-Xs*X4*XQ+Xi*X5*XQ~\"X2*X5*XQ + X3*X5*XQ+X4*X5*XQ+Xi*XQ*Xg + X2*XQ*XQ + 
X2*XQ*XQ+X4*XQ*XQ+X5*XQifXg+Xi*X7*XQ+X2*X7*Xg+X3*X7*XQ+X4*X7*XQ+X5*X7*Xg + 
XQ*X7*Xg+Xi*X8*Xg+X2*X8*Xg+X2*X8*Xg+X4*X8*Xg+X5*X8*Xg+XQ*X8*Xg+X7*X8*Xg. 
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